test(client): add pathspec coverage for Argo and Step Functions run IDs by dheerenmohta · Pull Request #3264 · Netflix/metaflow

dheerenmohta · 2026-06-11T09:01:07Z

Problem

test_pathspec_validation.py (added in #948) only covers numeric run IDs. This means the critical regression where Run('MyFlow/argo-...') raises MetaflowInvalidPathspec has zero test coverage and would go undetected.

Fix

Add two new test methods:

`test_run_valid_orchestrator_run_ids`

Verifies that the following formats are accepted as valid run pathspecs:

"MovieStatsFlow/argo-moviestatsflow-68z2h" (matches the tutorial example in 08-autopilot)
"MyFlow/sfn-my-execution-name" (AWS Step Functions format)

`test_task_valid_orchestrator_run_and_task_ids`

Same coverage for task-level pathspecs where the run ID is an orchestrator-style string.

Also removes test cases from test_run_invalid_pathspecs that asserted orchestrator-style strings (e.g. "MyFlow/abc") should be rejected — those cases are now intentionally valid.

Companion to PR #1 (fix/client-argo-sfn-run-id-validation).

Relates to #948

Fixes Netflix#948 Previously, pathspec validation only checked the number of components (e.g., "FlowName/RunID" has 2 parts). This caused confusing errors when users made typos or used invalid characters, since the error would come later from the metadata provider rather than at creation time. This change adds proper format validation: - Flow/Step/Artifact names must be valid identifiers (start with letter/underscore) - Run IDs and Task IDs must be numeric - Empty components are rejected (e.g., "Flow//Step") - Leading/trailing slashes are rejected The validation provides clear, actionable error messages that tell users exactly what's wrong with their pathspec. For example: - "Invalid flow name '123Flow'. Flow names must start with a letter..." - "Invalid run ID 'abc'. Run IDs must be numeric." Added comprehensive unit tests covering valid and invalid cases.

The existing test suite only covered numeric run IDs, which meant Bug 1 (non-numeric orchestrator run IDs raising MetaflowInvalidPathspec) had no regression coverage. Add test_run_valid_orchestrator_run_ids and test_task_valid_orchestrator_run_and_task_ids to explicitly exercise the "argo-*" and "sfn-*" run ID formats that Argo Workflows and AWS Step Functions produce respectively. Also tighten test_run_invalid_pathspecs to remove cases that assumed purely numeric run IDs (e.g. "MyFlow/abc") since those are now valid orchestrator-style IDs. Relates to Netflix#948

greptile-apps · 2026-06-11T09:04:37Z

Greptile Summary

This PR adds per-component pathspec validation to MetaflowObject._validate_pathspec_format and a new test file with coverage for Argo Workflows and AWS Step Functions run IDs. However, the core fix is incomplete: _NUMERIC_ID_PATTERN in core.py still enforces digits-only for run IDs, so the two new orchestrator-ID test methods will fail against the code shipped in this same PR.

metaflow/client/core.py: Introduces _validate_pathspec_format that validates each pathspec component by position, but the run-ID branch still uses ^[0-9]+$, rejecting argo-... and sfn-... strings the PR intends to allow.
test/unit/test_pathspec_validation.py: Adds 270 lines of unit tests; test_run_valid_orchestrator_run_ids and test_task_valid_orchestrator_run_and_task_ids will raise MetaflowInvalidPathspec (not pass) with the current validator; test_error_messages_are_helpful also hard-codes "numeric" in the error-message assertion, which will need updating once the regex is widened.

Confidence Score: 3/5

The two new orchestrator-ID tests will fail immediately because the run-ID regex in core.py was never updated to accept non-numeric prefixes; merging as-is adds broken tests to the suite.

The added validation method in core.py still uses a digits-only pattern for run IDs, so the very tests this PR introduces to prove the Argo/SFN fix will raise MetaflowInvalidPathspec instead of passing. The companion regex change is absent from core.py, leaving the test file and the production code in direct contradiction.

Both changed files need attention: metaflow/client/core.py needs the run-ID pattern widened to accept orchestrator prefixes, and test/unit/test_pathspec_validation.py needs the "numeric" error-message assertion updated to match the new pattern.

Important Files Changed

Filename	Overview
metaflow/client/core.py	Adds `_validate_pathspec_format` static method with per-component regex validation, but `_NUMERIC_ID_PATTERN` (digits-only) is still used for run IDs, so the orchestrator-ID tests added in this same PR will fail immediately.
test/unit/test_pathspec_validation.py	New test file with good coverage, but `test_run_valid_orchestrator_run_ids` and `test_task_valid_orchestrator_run_and_task_ids` will fail against the current code; `test_error_messages_are_helpful` also hard-codes "numeric" in the error-message assertion which conflicts with the intended broader validation.

_{Reviews (1): Last reviewed commit: "test(client): add pathspec tests for Arg..." | Re-trigger Greptile}

greptile-apps · 2026-06-11T09:04:41Z

+# Flow names and step names should be valid Python identifiers (start with letter/underscore)
+_FLOW_NAME_PATTERN = re.compile(r'^[a-zA-Z_][a-zA-Z0-9_]*$')
+# Run IDs and Task IDs are numeric
+_NUMERIC_ID_PATTERN = re.compile(r'^[0-9]+$')


Run ID pattern still numeric-only, breaking the new tests

_NUMERIC_ID_PATTERN only matches ^[0-9]+$, so _validate_pathspec_format at lines 347–351 will raise MetaflowInvalidPathspec for every Argo/SFN run ID (e.g. argo-moviestatsflow-68z2h, sfn-my-execution-name). test_run_valid_orchestrator_run_ids and test_task_valid_orchestrator_run_and_task_ids both call the same method and expect those pathspecs to return successfully — those tests will fail against this PR's code as written. The companion fix that widens the run-ID pattern (or introduces a separate _ORCHESTRATOR_RUN_ID_PATTERN) is not present in core.py here.

greptile-apps · 2026-06-11T09:04:42Z

+        # Test invalid run ID
+        with self.assertRaises(MetaflowInvalidPathspec) as cm:
+            MetaflowObject._validate_pathspec_format("MyFlow/abc", "run")
+        error_msg = str(cm.exception)
+        self.assertIn("abc", error_msg)
+        self.assertIn("run ID", error_msg)
+        self.assertIn("numeric", error_msg.lower())


Error-message assertion contradicts orchestrator-ID intent

This block asserts that "MyFlow/abc" raises with "numeric" in the message. Once the validation is fixed to accept argo- / sfn- prefixed IDs (the missing companion change), the error message "Run IDs must be numeric" will no longer be accurate for the broader class of valid formats, and assertIn("numeric", ...) will create a confusing contract. Consider updating this assertion to match whatever the new error message says (e.g. "must be a valid run ID") at the same time the regex is widened.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

greptile-apps · 2026-06-11T09:04:43Z

+    def test_step_invalid_pathspecs(self):
+        """Test that invalid step pathspecs are rejected."""
+        invalid_cases = [
+            ("MyFlow/123", "too few components"),
+            ("MyFlow/123/start/extra", "too many components"),
+            ("MyFlow/abc/start", "non-numeric run ID"),
+            ("MyFlow/123/123step", "step name starts with number"),
+            ("MyFlow/123/my-step", "step name contains dash"),
+            ("MyFlow//start", "empty run ID"),
+            ("MyFlow/123//", "empty step name"),
+        ]
+        for pathspec, description in invalid_cases:
+            with self.subTest(pathspec=pathspec, reason=description):
+                with self.assertRaises(MetaflowInvalidPathspec):
+                    MetaflowObject._validate_pathspec_format(pathspec, "step")


Step-level orchestrator run ID not covered

test_step_invalid_pathspecs keeps ("MyFlow/abc/start", "non-numeric run ID") as a rejection case, but there is no positive test for a step pathspec that contains an orchestrator run ID (e.g. "MyFlow/argo-myflow-abc12/start"). test_task_valid_orchestrator_run_and_task_ids covers the task level, but the step and artifact levels have the same underlying validation and should also get orchestrator-ID positive cases to prevent future regressions.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

codecov · 2026-06-11T10:06:47Z

Codecov Report

❌ Patch coverage is 94.59459% with 2 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (master@6a634bd). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
metaflow/client/core.py	94.59%	0 Missing and 2 partials ⚠️

Additional details and impacted files

@@            Coverage Diff            @@
##             master    #3264   +/-   ##
=========================================
  Coverage          ?   29.04%           
=========================================
  Files             ?      381           
  Lines             ?    52539           
  Branches          ?     9277           
=========================================
  Hits              ?    15260           
  Misses            ?    36252           
  Partials          ?     1027

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

dheerenmohta added 2 commits February 19, 2026 07:15

greptile-apps Bot reviewed Jun 11, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

test(client): add pathspec coverage for Argo and Step Functions run IDs#3264

test(client): add pathspec coverage for Argo and Step Functions run IDs#3264
dheerenmohta wants to merge 2 commits into
Netflix:masterfrom
dheerenmohta:test/client-pathspec-orchestrator-run-ids

dheerenmohta commented Jun 11, 2026

Uh oh!

greptile-apps Bot commented Jun 11, 2026

Uh oh!

greptile-apps Bot Jun 11, 2026

Uh oh!

greptile-apps Bot Jun 11, 2026

Uh oh!

greptile-apps Bot Jun 11, 2026

Uh oh!

codecov Bot commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

dheerenmohta commented Jun 11, 2026

Problem

Fix

test_run_valid_orchestrator_run_ids

test_task_valid_orchestrator_run_and_task_ids

Uh oh!

greptile-apps Bot commented Jun 11, 2026

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Uh oh!

greptile-apps Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented Jun 11, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

`test_run_valid_orchestrator_run_ids`

`test_task_valid_orchestrator_run_and_task_ids`